Trapezoid Distribution (trapezoid)#

The trapezoid distribution is a simple, flexible bounded continuous distribution whose density rises linearly, stays flat, and then falls linearly.

It’s a useful middle ground between:

  • Uniform (completely flat on an interval), and

  • Triangular (purely linear up then down).

What you’ll learn#

  • a clean parameterization and how it maps to scipy.stats.trapezoid

  • the PDF/CDF (with LaTeX)

  • moments (mean/variance/skewness/kurtosis), MGF/characteristic function, and entropy

  • derivations for expectation, variance, and the likelihood

  • NumPy-only sampling (inverse transform / piecewise construction)

  • visualization patterns and practical statistical use cases

Notebook roadmap#

  1. Title & classification

  2. Intuition & motivation

  3. Formal definition (PDF/CDF)

  4. Moments & properties

  5. Parameter interpretation

  6. Derivations (expectation, variance, likelihood)

  7. Sampling & simulation (NumPy-only)

  8. Visualization (PDF/CDF/samples)

  9. SciPy integration (scipy.stats.trapezoid)

  10. Statistical use cases

  11. Pitfalls

  12. Summary

import math

import numpy as np
import scipy
from scipy import stats

import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

pio.templates.default = "plotly_white"
# Plotly rendering (CKC convention)
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

SEED = 7
rng = np.random.default_rng(SEED)

np.set_printoptions(precision=6, suppress=True)

print("numpy ", np.__version__)
print("scipy ", scipy.__version__)
print("plotly", plotly.__version__)
numpy  1.26.2
scipy  1.15.0
plotly 6.5.2

Prerequisites & notation#

Prerequisites

  • Comfort with PDFs/CDFs and expectation.

  • Basic calculus (integrals of polynomials and exponentials).

Notation

  • We’ll use the breakpoint parameterization with four real numbers (a \le b \le c \le d) and (a < d).

  • The support is ([a,d]).

Geometrically, the density looks like a trapezoid:

  • linearly increasing from (a) to (b)

  • flat from (b) to (c)

  • linearly decreasing from (c) to (d)

We’ll also show how to map ((a,b,c,d)) to SciPy’s ((c_{\text{shape}}, d_{\text{shape}}, \mathrm{loc}, \mathrm{scale})).

1) Title & Classification#

  • Name: trapezoid (Trapezoid distribution)

  • Type: continuous

  • Support: (x \in [a, d])

  • Parameter space (breakpoints):

    • (a < d)

    • (a \le b \le c \le d)

Special / limiting cases#

  • Uniform on ([a,d]): (b=a) and (c=d).

  • Triangular on ([a,d]): (b=c) (peak height occurs at (x=b=c)).

  • Right-triangular (peak at the right edge): (b=c=d).

  • Left-triangular (peak at the left edge): (a=b=c).

SciPy parameterization#

SciPy’s scipy.stats.trapezoid is defined on ([0,1]) with shape parameters (0 \le c_{\text{shape}} \le d_{\text{shape}} \le 1), and then shifted/scaled:

[ X = \mathrm{loc} + \mathrm{scale},Z,\qquad Z \sim \mathrm{trapezoid}(c_{\text{shape}}, d_{\text{shape}}),\quad Z\in[0,1]. ]

Mapping between parameterizations: [ \mathrm{loc}=a,\quad \mathrm{scale}=d-a,\quad c_{\text{shape}}=\frac{b-a}{d-a},\quad d_{\text{shape}}=\frac{c-a}{d-a}. ]

2) Intuition & Motivation#

What it models#

The trapezoid distribution models a bounded quantity where:

  • values in a central interval are roughly equally plausible (the plateau), and

  • plausibility tapers linearly toward the endpoints.

It’s a convenient way to encode “I’m pretty sure it’s between (b) and (c), but ([a,d]) is still possible”.

Typical real-world use cases#

  • Expert elicitation / priors: a bounded parameter with a “most plausible range.”

  • Sensor ranges: hard limits ([a,d]) with softened edges due to measurement fuzz.

  • Simulation / generative modeling: simple bounded noise that isn’t overly concentrated at a single point.

Relations to other distributions#

  • Uniform: no taper; all points equally likely.

  • Triangular: no plateau; linear up then down.

  • Beta: a smooth alternative on ([0,1]); trapezoid is the piecewise-linear alternative.

  • Truncated normal: another bounded model (after truncation), but typically more computationally involved.

3) Formal Definition#

Let (X) be a continuous random variable with breakpoints (a \le b \le c \le d), (a<d).

3.1 PDF#

The PDF is piecewise:

[ f(x) = \begin{cases} 0, & x < a,\ \displaystyle h,\frac{x-a}{b-a}, & a \le x < b,\ h, & b \le x \le c,\ \displaystyle h,\frac{d-x}{d-c}, & c < x \le d,\ 0, & x > d, \end{cases} ]

where the height (h) is chosen to normalize the density.

3.2 Normalization constant#

The area under the trapezoid is the sum of two triangles and a rectangle:

[ 1 = \frac{1}{2}(b-a)h + (c-b)h + \frac{1}{2}(d-c)h = h,\frac{c+d-a-b}{2}. ]

So [ h = \frac{2}{c+d-a-b}. ]

3.3 CDF#

The CDF (F(x)=\mathbb{P}(X\le x)) is:

[ F(x)=\begin{cases} 0, & x<a,\ \displaystyle \frac{h}{2(b-a)}(x-a)^2, & a\le x<b,\ \displaystyle h\Bigl(x-\frac{a+b}{2}\Bigr), & b\le x\le c,\ \displaystyle 1-\frac{h}{2(d-c)}(d-x)^2, & c<x\le d,\ 1, & x>d. \end{cases} ]

We’ll implement these in NumPy next.

from dataclasses import dataclass


@dataclass(frozen=True)
class TrapezoidParams:
    a: float
    b: float
    c: float
    d: float


def validate_trapezoid_params(a: float, b: float, c: float, d: float) -> None:
    if not (np.isfinite([a, b, c, d]).all()):
        raise ValueError("All parameters must be finite real numbers.")
    if not (a < d):
        raise ValueError("Require a < d (non-degenerate support).")
    if not (a <= b <= c <= d):
        raise ValueError("Require a <= b <= c <= d.")


def trapezoid_height(a: float, b: float, c: float, d: float) -> float:
    validate_trapezoid_params(a, b, c, d)
    return 2.0 / (c + d - a - b)


def trapezoid_pdf(x: np.ndarray, a: float, b: float, c: float, d: float) -> np.ndarray:
    '''Trapezoid PDF in breakpoint form (NumPy-only).'''
    validate_trapezoid_params(a, b, c, d)
    x = np.asarray(x, dtype=float)
    h = trapezoid_height(a, b, c, d)

    pdf = np.zeros_like(x)

    if b > a:
        left = (x >= a) & (x < b)
        pdf[left] = h * (x[left] - a) / (b - a)

    mid = (x >= b) & (x <= c)
    pdf[mid] = h

    if d > c:
        right = (x > c) & (x <= d)
        pdf[right] = h * (d - x[right]) / (d - c)

    return pdf


def trapezoid_cdf(x: np.ndarray, a: float, b: float, c: float, d: float) -> np.ndarray:
    '''Trapezoid CDF in breakpoint form (NumPy-only).'''
    validate_trapezoid_params(a, b, c, d)
    x = np.asarray(x, dtype=float)
    h = trapezoid_height(a, b, c, d)

    F = np.zeros_like(x)

    # a <= x < b
    if b > a:
        left = (x >= a) & (x < b)
        F[left] = (h / (2.0 * (b - a))) * (x[left] - a) ** 2

    # b <= x <= c
    mid = (x >= b) & (x <= c)
    F[mid] = h * (x[mid] - 0.5 * (a + b))

    # c < x <= d
    if d > c:
        right = (x > c) & (x <= d)
        F[right] = 1.0 - (h / (2.0 * (d - c))) * (d - x[right]) ** 2

    F = np.where(x > d, 1.0, F)
    return F


def trapezoid_ppf(u: np.ndarray, a: float, b: float, c: float, d: float) -> np.ndarray:
    '''Inverse CDF (percent-point function) using closed-form pieces.'''
    validate_trapezoid_params(a, b, c, d)
    u = np.asarray(u, dtype=float)
    if np.any((u < 0) | (u > 1)):
        raise ValueError("u must be in [0, 1].")

    den = c + d - a - b
    w1 = (b - a) / den
    w2 = 2.0 * (c - b) / den
    w3 = (d - c) / den

    t1 = w1
    t2 = w1 + w2

    x = np.empty_like(u)

    if w1 > 0:
        m1 = u < t1
        x[m1] = a + (b - a) * np.sqrt(u[m1] / w1)

    if w2 > 0:
        m2 = (u >= t1) & (u < t2)
        x[m2] = b + (c - b) * (u[m2] - w1) / w2

    # Remaining mass is the right triangle (or just u==1 when w3==0)
    m3 = u >= t2
    if w3 > 0:
        x[m3] = d - (d - c) * np.sqrt((1.0 - u[m3]) / w3)
    else:
        x[m3] = d

    x = np.where(u == 0, a, x)
    x = np.where(u == 1, d, x)
    return x


def trapezoid_rvs(size, a: float, b: float, c: float, d: float, rng: np.random.Generator | None = None) -> np.ndarray:
    '''Random variates via inverse transform sampling (NumPy-only).'''
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(size)
    return trapezoid_ppf(u, a, b, c, d)


def trapezoid_raw_moment(n: int, a: float, b: float, c: float, d: float) -> float:
    '''Raw moment E[X^n] using analytic integration of the PDF pieces.'''
    if n < 0:
        raise ValueError("n must be non-negative.")

    validate_trapezoid_params(a, b, c, d)
    h = trapezoid_height(a, b, c, d)

    m = 0.0

    # Left ramp: h/(b-a) * ∫ x^n (x-a) dx
    if b > a:
        m += h / (b - a) * (
            (b ** (n + 2) - a ** (n + 2)) / (n + 2)
            - a * (b ** (n + 1) - a ** (n + 1)) / (n + 1)
        )

    # Plateau: h * ∫ x^n dx
    if c > b:
        m += h * (c ** (n + 1) - b ** (n + 1)) / (n + 1)

    # Right ramp: h/(d-c) * ∫ x^n (d-x) dx
    if d > c:
        m += h / (d - c) * (
            d * (d ** (n + 1) - c ** (n + 1)) / (n + 1)
            - (d ** (n + 2) - c ** (n + 2)) / (n + 2)
        )

    return float(m)


def trapezoid_moments(a: float, b: float, c: float, d: float) -> dict[str, float]:
    '''Mean, variance, skewness, and excess kurtosis.'''
    m1 = trapezoid_raw_moment(1, a, b, c, d)
    m2 = trapezoid_raw_moment(2, a, b, c, d)
    m3 = trapezoid_raw_moment(3, a, b, c, d)
    m4 = trapezoid_raw_moment(4, a, b, c, d)

    var = m2 - m1**2
    if var <= 0:
        raise RuntimeError("Variance computed as non-positive; check parameters.")

    mu2 = var
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4

    skew = mu3 / (mu2 ** 1.5)
    kurt_excess = mu4 / (mu2**2) - 3.0

    return {
        "mean": float(m1),
        "var": float(var),
        "skew": float(skew),
        "kurtosis_excess": float(kurt_excess),
    }


def trapezoid_entropy(a: float, b: float, c: float, d: float) -> float:
    '''Differential entropy H = -∫ f log f (closed form for this piecewise-linear PDF).'''
    validate_trapezoid_params(a, b, c, d)
    den = c + d - a - b
    h = 2.0 / den

    # Derived: H = -log(h) + h*((b-a)+(d-c))/4
    return float(-np.log(h) + h * ((b - a) + (d - c)) / 4.0)


def trapezoid_mgf(t: np.ndarray, a: float, b: float, c: float, d: float) -> np.ndarray:
    '''MGF M(t)=E[e^{tX}] (exists for all real t because support is bounded).'''
    validate_trapezoid_params(a, b, c, d)
    t = np.asarray(t)
    h = trapezoid_height(a, b, c, d)

    out = np.empty_like(t, dtype=np.complex128 if np.iscomplexobj(t) else float)

    near0 = np.isclose(t, 0)
    out[near0] = 1.0

    tt = t[~near0]
    res = 0.0

    # Left ramp
    if b > a:
        res = res + h / (b - a) * (
            np.exp(tt * b) * ((b - a) / tt - 1.0 / (tt**2))
            - np.exp(tt * a) * (0.0 / tt - 1.0 / (tt**2))
        )

    # Plateau
    if c > b:
        res = res + h * (np.exp(tt * c) - np.exp(tt * b)) / tt

    # Right ramp
    if d > c:
        res = res + h / (d - c) * (
            np.exp(tt * d) * (0.0 / tt + 1.0 / (tt**2))
            - np.exp(tt * c) * ((d - c) / tt + 1.0 / (tt**2))
        )

    out[~near0] = res
    return out


def trapezoid_cf(omega: np.ndarray, a: float, b: float, c: float, d: float) -> np.ndarray:
    '''Characteristic function φ(ω)=E[e^{iωX}] via MGF with t=iω.'''
    omega = np.asarray(omega, dtype=float)
    return trapezoid_mgf(1j * omega, a, b, c, d)


def to_scipy_params(a: float, b: float, c: float, d: float) -> tuple[float, float, float, float]:
    '''Convert (a,b,c,d) breakpoints to SciPy (c_shape, d_shape, loc, scale).'''
    validate_trapezoid_params(a, b, c, d)
    loc = a
    scale = d - a
    c_shape = (b - a) / scale
    d_shape = (c - a) / scale
    return float(c_shape), float(d_shape), float(loc), float(scale)


def from_scipy_params(c_shape: float, d_shape: float, loc: float = 0.0, scale: float = 1.0) -> TrapezoidParams:
    '''Convert SciPy parameters to breakpoints (a,b,c,d).'''
    if not (0 <= c_shape <= d_shape <= 1):
        raise ValueError("Require 0 <= c_shape <= d_shape <= 1.")
    if not (scale > 0):
        raise ValueError("Require scale > 0.")

    a = loc
    d = loc + scale
    b = loc + scale * c_shape
    c = loc + scale * d_shape
    return TrapezoidParams(float(a), float(b), float(c), float(d))

4) Moments & Properties#

Because the support is bounded, all moments exist and the MGF exists for all real (t).

4.1 Mean and variance#

A very intuitive way to compute (\mathbb{E}[X]) is to view the PDF area as a left triangle + rectangle + right triangle.

Let [ \Delta = c+d-a-b,\qquad h = \frac{2}{\Delta}. ] The probability masses of the three pieces are: [ w_1 = \frac{1}{2}(b-a)h = \frac{b-a}{\Delta},\qquad w_2 = (c-b)h = \frac{2(c-b)}{\Delta},\qquad w_3 = \frac{1}{2}(d-c)h = \frac{d-c}{\Delta}. ]

The corresponding centroids (conditional means) are:

  • left triangle on ([a,b]): (\mathbb{E}[X\mid\text{left}] = \tfrac{a+2b}{3})

  • rectangle on ([b,c]): (\mathbb{E}[X\mid\text{mid}] = \tfrac{b+c}{2})

  • right triangle on ([c,d]): (\mathbb{E}[X\mid\text{right}] = \tfrac{2c+d}{3})

So [ \mathbb{E}[X] = w_1\frac{a+2b}{3} + w_2\frac{b+c}{2} + w_3\frac{2c+d}{3}. ]

For the second raw moment, one can similarly use known triangular/uniform moments: [ \mathbb{E}[X^2 \mid \text{left}] = \frac{a^2}{6} + \frac{ab}{3} + \frac{b^2}{2},\quad \mathbb{E}[X^2 \mid \text{mid}] = \frac{b^2+bc+c^2}{3},\quad \mathbb{E}[X^2 \mid \text{right}] = \frac{c^2}{2} + \frac{cd}{3} + \frac{d^2}{6}. ]

Then ( \mathrm{Var}(X) = \mathbb{E}[X^2] - \mathbb{E}[X]^2. )

4.2 Skewness and kurtosis#

Skewness and (excess) kurtosis come from central moments: [ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}},\qquad \gamma_2 = \frac{\mu_4}{\mu_2^2} - 3. ]

In code we compute raw moments (m_k=\mathbb{E}[X^k]) for (k=1,2,3,4) analytically and convert to central moments.

4.3 MGF and characteristic function#

The MGF is [ M(t)=\mathbb{E}[e^{tX}] = \int_a^d e^{tx} f(x),dx, ] and the characteristic function is (\varphi(\omega)=M(i\omega)).

4.4 Entropy#

The differential entropy is [ H(X) = -\int_a^d f(x)\log f(x),dx. ]

For this piecewise-linear PDF, it has a clean closed form: [ H(X) = -\log h + \frac{h}{4}\bigl((b-a)+(d-c)\bigr). ]

(And it correctly reduces to (\log(d-a)) for the uniform case.)

# Example parameter set
params = TrapezoidParams(a=0.0, b=0.2, c=0.7, d=1.0)
a, b, c, d = params.a, params.b, params.c, params.d

mom = trapezoid_moments(a, b, c, d)
H = trapezoid_entropy(a, b, c, d)

mom, H
({'mean': 0.47777777777777775,
  'var': 0.052283950617283886,
  'skew': 0.06384207577550371,
  'kurtosis_excess': -0.9776791202787192},
 -0.12101540578511419)
# Monte Carlo validation of moments
n = 500_000
x = trapezoid_rvs(n, a, b, c, d, rng=rng)

mc = {
    "mean": x.mean(),
    "var": x.var(),
    "skew": ((x - x.mean()) ** 3).mean() / (x.var() ** 1.5),
    "kurtosis_excess": ((x - x.mean()) ** 4).mean() / (x.var() ** 2) - 3,
}

mom, mc
({'mean': 0.47777777777777775,
  'var': 0.052283950617283886,
  'skew': 0.06384207577550371,
  'kurtosis_excess': -0.9776791202787192},
 {'mean': 0.47773031266121874,
  'var': 0.0521678953325731,
  'skew': 0.06165757434756212,
  'kurtosis_excess': -0.9794766229452243})
# Quick check: entropy via numerical integration
xx = np.linspace(a, d, 200_000)
pdf = trapezoid_pdf(xx, a, b, c, d)

# Avoid log(0) with a mask
mask = pdf > 0
H_num = -np.trapz(pdf[mask] * np.log(pdf[mask]), xx[mask])

H, H_num
(-0.12101540578511419, -0.12101540752540882)

5) Parameter Interpretation#

The four breakpoints determine the shape:

  • (a): left endpoint of the support

  • (d): right endpoint of the support

  • (b): where the density finishes its linear ramp up (start of the plateau)

  • (c): where the plateau ends and the density begins its linear ramp down

Shape changes#

  • Increasing (b) (holding (a,c,d) fixed) makes the left ramp longer → more mass pulled left.

  • Decreasing (c) (holding (a,b,d) fixed) makes the right ramp longer → more mass pulled right.

  • When (b\to c), the plateau disappears and you recover a triangular distribution.

  • When (b\to a) and (c\to d), both ramps vanish and you recover a uniform distribution.

# Visualizing how (b, c) change the shape while keeping support [a,d] fixed.

a0, d0 = 0.0, 1.0
configs = [
    (0.05, 0.95, "almost uniform"),
    (0.20, 0.80, "moderate plateau"),
    (0.45, 0.55, "almost triangular"),
    (0.50, 0.50, "triangular (b=c)"),
]

xgrid = np.linspace(a0, d0, 500)

fig = go.Figure()
for b0, c0, label in configs:
    y = trapezoid_pdf(xgrid, a0, b0, c0, d0)
    fig.add_trace(go.Scatter(x=xgrid, y=y, mode="lines", name=f"b={b0:.2f}, c={c0:.2f} ({label})"))

fig.update_layout(
    title="Trapezoid PDF shapes for different (b, c)",
    xaxis_title="x",
    yaxis_title="pdf(x)",
)
fig.show()

6) Derivations#

6.1 Expectation#

Split the expectation into the three regions and integrate:

[ \mathbb{E}[X] = \int_a^b x,h\frac{x-a}{b-a},dx + \int_b^c x,h,dx + \int_c^d x,h\frac{d-x}{d-c},dx. ]

A cleaner geometric shortcut is to treat the PDF area as a mixture of:

  • a right-triangular density on ([a,b])

  • a uniform density on ([b,c])

  • a left-triangular density on ([c,d])

with weights (w_1,w_2,w_3) given earlier.

6.2 Variance#

Compute the second raw moment (\mathbb{E}[X^2]) using the same partition (or by integrating (x^2 f(x)) piecewise), then ( \mathrm{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2. )

6.3 Likelihood#

Given i.i.d. data (x_1,\dots,x_n), the likelihood is [ L(a,b,c,d) = \prod_{i=1}^n f(x_i\mid a,b,c,d). ]

Assuming all samples lie in ([a,d]), the log-likelihood can be written as:

[ \ell = n\log h

  • \sum_{x_i\in[a,b)} \log(x_i-a) - n_{\text{left}}\log(b-a)

  • \sum_{x_i\in(c,d]} \log(d-x_i) - n_{\text{right}}\log(d-c). ]

(Points in the flat region contribute only (\log h).)

MLE note: maximizing this with respect to ((a,b,c,d)) is a constrained optimization problem and can be numerically delicate; SciPy’s .fit works in the ([0,1])-shape parameterization but still benefits from good initialization.

# Verify the geometric mean formula matches raw-moment mean.

def trapezoid_mean_geometric(a: float, b: float, c: float, d: float) -> float:
    den = c + d - a - b
    w1 = (b - a) / den
    w2 = 2.0 * (c - b) / den
    w3 = (d - c) / den

    m_left = (a + 2 * b) / 3.0
    m_mid = (b + c) / 2.0
    m_right = (2 * c + d) / 3.0

    return float(w1 * m_left + w2 * m_mid + w3 * m_right)


mean_geom = trapezoid_mean_geometric(a, b, c, d)
mean_raw = trapezoid_raw_moment(1, a, b, c, d)
mean_geom, mean_raw
(0.47777777777777775, 0.47777777777777775)
# Log-likelihood implementation (useful for fitting / hypothesis tests)

def trapezoid_loglik(x: np.ndarray, a: float, b: float, c: float, d: float) -> float:
    '''Log-likelihood for i.i.d. observations under the trapezoid distribution.'''
    validate_trapezoid_params(a, b, c, d)
    x = np.asarray(x, dtype=float)

    # Outside support => likelihood 0
    if np.any((x < a) | (x > d)):
        return -np.inf

    h = trapezoid_height(a, b, c, d)
    ll = x.size * np.log(h)

    if b > a:
        left = (x >= a) & (x < b)
        ll += np.log(x[left] - a).sum() - left.sum() * np.log(b - a)

    if d > c:
        right = (x > c) & (x <= d)
        ll += np.log(d - x[right]).sum() - right.sum() * np.log(d - c)

    # Mid region contributes only log(h), already included.
    return float(ll)


# Demo: log-likelihood of simulated data under true parameters
x_small = trapezoid_rvs(2000, a, b, c, d, rng=rng)
trapezoid_loglik(x_small, a, b, c, d)
276.5668662649434

7) Sampling & Simulation (NumPy-only)#

Two clean ways to sample:

  1. Inverse transform: sample (U\sim\mathrm{Uniform}(0,1)) and compute (X=F^{-1}(U)).

    • For the trapezoid distribution, (F^{-1}) is available in closed form (square roots + linear pieces).

  2. Piecewise construction (mixture): the density is made of three simple components:

    • a right-triangular distribution on ([a,b])

    • a uniform distribution on ([b,c])

    • a left-triangular distribution on ([c,d])

    Sample which component you’re in using weights (w_1,w_2,w_3), then sample from the chosen component.

Our trapezoid_rvs uses the inverse-CDF implementation (trapezoid_ppf), which is vectorized and deterministic under a fixed RNG seed.

8) Visualization#

We’ll visualize:

  • the PDF and CDF on a grid

  • a Monte Carlo histogram overlaid with the PDF

# PDF / CDF plots
xgrid = np.linspace(a - 0.1 * (d - a), d + 0.1 * (d - a), 600)

pdf = trapezoid_pdf(xgrid, a, b, c, d)
cdf = trapezoid_cdf(xgrid, a, b, c, d)

fig_pdf = go.Figure()
fig_pdf.add_trace(go.Scatter(x=xgrid, y=pdf, mode="lines", name="pdf"))
for x0 in [a, b, c, d]:
    fig_pdf.add_vline(x=x0, line_width=1, line_dash="dot")
fig_pdf.update_layout(title="Trapezoid PDF", xaxis_title="x", yaxis_title="pdf(x)")
fig_pdf.show()

fig_cdf = go.Figure()
fig_cdf.add_trace(go.Scatter(x=xgrid, y=cdf, mode="lines", name="cdf"))
for x0 in [a, b, c, d]:
    fig_cdf.add_vline(x=x0, line_width=1, line_dash="dot")
fig_cdf.update_layout(title="Trapezoid CDF", xaxis_title="x", yaxis_title="F(x)")
fig_cdf.show()
# Monte Carlo histogram vs PDF
n = 50_000
samples = trapezoid_rvs(n, a, b, c, d, rng=rng)

fig = go.Figure()
fig.add_trace(
    go.Histogram(
        x=samples,
        nbinsx=80,
        histnorm="probability density",
        name="samples",
        opacity=0.6,
    )
)
fig.add_trace(go.Scatter(x=xgrid, y=pdf, mode="lines", name="pdf", line=dict(width=3)))
fig.update_layout(title="Samples vs PDF", xaxis_title="x", yaxis_title="density")
fig.show()

9) SciPy Integration (scipy.stats.trapezoid)#

SciPy’s implementation uses shape parameters (c_{\text{shape}}, d_{\text{shape}}) on ([0,1]), with loc and scale for shifting/scaling.

We’ll:

  • convert ((a,b,c,d)) to SciPy parameters

  • compare our NumPy PDF/CDF to SciPy’s

  • sample with .rvs

  • fit parameters with .fit (and show why initialization matters)

# Convert to SciPy parameterization and compare
c_shape, d_shape, loc, scale = to_scipy_params(a, b, c, d)

dist = stats.trapezoid(c_shape, d_shape, loc=loc, scale=scale)

pdf_scipy = dist.pdf(xgrid)
cdf_scipy = dist.cdf(xgrid)

max_pdf_err = np.max(np.abs(pdf_scipy - pdf))
max_cdf_err = np.max(np.abs(cdf_scipy - cdf))
max_pdf_err, max_cdf_err
(0.0, 1.1102230246251565e-16)
# SciPy sampling
s_scipy = dist.rvs(size=10_000, random_state=rng)

# Compare sample means
s_scipy.mean(), samples.mean()
(0.47476570557688047, 0.47767874741864774)
# SciPy fitting demo: the fit can be sensitive without good initialization.
true = dict(c=c_shape, d=d_shape, loc=loc, scale=scale)
data = stats.trapezoid.rvs(**true, size=3000, random_state=rng)

fit_unconstrained = stats.trapezoid.fit(data)

# Fix support endpoints (loc/scale) if they are known, and provide initial guesses for (c,d)
fit_fixed_support = stats.trapezoid.fit(data, 0.3, 0.6, floc=loc, fscale=scale)

fit_unconstrained, fit_fixed_support
((1.0, 1.0, -0.09811792447989266, 1.1809937986958925),
 (0.18285880192823223, 0.688053309454262, 0.0, 1.0))

10) Statistical Use Cases#

10.1 Hypothesis testing#

A common situation: you believe a variable is approximately uniform on ([a,d]), but want to test whether there’s evidence for a central plateau (a trapezoid).

  • Null: uniform on ([a,d]) (equivalently (b=a), (c=d) → SciPy (c_{\text{shape}}=0), (d_{\text{shape}}=1))

  • Alternative: trapezoid with free shape parameters (e.g. free (c_{\text{shape}}, d_{\text{shape}}) but fixed (\mathrm{loc},\mathrm{scale}))

You can compare log-likelihoods (likelihood ratio), or use a goodness-of-fit test (e.g. KS test) when parameters are fixed.

10.2 Bayesian modeling#

As a prior over a bounded parameter (\theta\in[a,d]), a trapezoid prior encodes:

  • a hard bound ([a,d])

  • a most plausible interval ([b,c])

  • linear tapering outside ([b,c])

This is not generally conjugate, but it’s easy to use in grid-based Bayes or MCMC.

10.3 Generative modeling#

The trapezoid distribution is a useful primitive in simulations:

  • bounded noise with a “flat core” and softened edges

  • simple distributions for latent variables constrained to an interval

# 10.1 Hypothesis test demo: uniform vs trapezoid (fixed support)
#
# Warning: if you estimate parameters from data, classical KS p-values are no longer exact.

# Generate data from a trapezoid alternative
loc0, scale0 = 0.0, 1.0
c_true, d_true = 0.2, 0.7
x_alt = stats.trapezoid.rvs(c_true, d_true, loc=loc0, scale=scale0, size=2000, random_state=rng)

# Log-likelihood under uniform on [0,1]
ll_uniform = np.sum(stats.uniform(loc=loc0, scale=scale0).logpdf(x_alt))

# Fit trapezoid with fixed support and decent initialization
c_hat, d_hat, loc_hat, scale_hat = stats.trapezoid.fit(x_alt, 0.3, 0.6, floc=loc0, fscale=scale0)
ll_trap = np.sum(stats.trapezoid.logpdf(x_alt, c_hat, d_hat, loc=loc0, scale=scale0))

(ll_uniform, ll_trap, c_hat, d_hat)
(0.0, 261.55753985773356, 0.2276372793264453, 0.6866876636087544)
# 10.2 Bayesian modeling demo: trapezoid prior + Normal likelihood (grid Bayes)

# Parameter to infer
prior_params = TrapezoidParams(a=0.0, b=0.3, c=0.7, d=1.0)

# Observation model: y | theta ~ Normal(theta, sigma^2)
sigma = 0.08
y_obs = 0.55

# Grid approximation for posterior
theta = np.linspace(prior_params.a, prior_params.d, 2000)
prior = trapezoid_pdf(theta, prior_params.a, prior_params.b, prior_params.c, prior_params.d)

lik = stats.norm(loc=theta, scale=sigma).pdf(y_obs)
post_unnorm = prior * lik
post = post_unnorm / np.trapz(post_unnorm, theta)

# Compare to a uniform prior on [a,d]
prior_u = stats.uniform(loc=prior_params.a, scale=prior_params.d - prior_params.a).pdf(theta)
post_u = (prior_u * lik)
post_u = post_u / np.trapz(post_u, theta)

fig = go.Figure()
fig.add_trace(go.Scatter(x=theta, y=prior / np.trapz(prior, theta), mode="lines", name="trapezoid prior"))
fig.add_trace(go.Scatter(x=theta, y=post, mode="lines", name="posterior (trapezoid prior)", line=dict(width=3)))
fig.add_trace(go.Scatter(x=theta, y=post_u, mode="lines", name="posterior (uniform prior)", line=dict(dash="dash")))
fig.add_vline(x=y_obs, line_dash="dot")
fig.update_layout(
    title="Grid Bayes: trapezoid prior vs uniform prior",
    xaxis_title="theta",
    yaxis_title="density",
)
fig.show()

11) Pitfalls#

  • Invalid parameters: always enforce (a < d) and (a \le b \le c \le d).

  • Degenerate cases:

    • (b=a) removes the left ramp; (c=d) removes the right ramp; (b=c) removes the plateau.

    • These are legitimate, but formulas that divide by ((b-a)) or ((d-c)) must branch.

  • Endpoint logs (likelihood): for continuous models, exact hits at (a) or (d) can create (\log(0)). In practice, treat such hits as numerical artifacts (or add small jitter) unless endpoints are truly observed with mass.

  • Fitting sensitivity: .fit can converge to boundary solutions without good initialization or constraints (fixing loc/scale can help a lot).

  • Numerical cancellation: for MGF near (t=0), use a special-case (we return 1 at (t=0)).

12) Summary#

  • The trapezoid distribution is a bounded, piecewise-linear PDF: ramp up → plateau → ramp down.

  • A convenient parameterization uses breakpoints ((a,b,c,d)) and height (h=2/(c+d-a-b)).

  • The CDF and inverse CDF are available in closed form, enabling fast NumPy-only sampling.

  • Mean/variance (and higher moments) can be derived by integrating piecewise or by decomposing into triangle + rectangle + triangle.

  • In SciPy, use scipy.stats.trapezoid(c_shape, d_shape, loc, scale) with a straightforward mapping from ((a,b,c,d)).